Despite the immense success of neural networks in modeling system dynamics from data, they often remain physics-agnostic black boxes. In the particular case of physical systems, they may consequently make physically inconsistent predictions, which makes them unreliable in practice. In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems, and rely on Neural Ordinary Differential Equations (NODEs) to learn their parameters from data. Since IPHS models are consistent with the first and second laws of thermodynamics by design, so are the proposed Physically Consistent NODEs (PC-NODEs). Furthermore, the NODE training procedure allows us to seamlessly incorporate prior knowledge of the system properties into the learned dynamics. We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from real-world measurements and the dynamics of a simulated gas-piston system. Thanks to the modularity and flexibility of the IPHS framework, PC-NODEs can be extended to learn physically consistent models of multi-physics distributed systems.
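As an illustrative sketch (not the authors' implementation, and omitting the irreversible entropy terms of full IPHS), the snippet below integrates a port-Hamiltonian vector field dx/dt = (J - R) dH/dx and shows why the structure itself enforces thermodynamic consistency: a skew-symmetric J conserves the Hamiltonian, while a positive semi-definite R can only dissipate it. All matrices and names here are hypothetical.

```python
import numpy as np

def hamiltonian(x, Q):
    """Quadratic energy H(x) = 0.5 * x^T Q x."""
    return 0.5 * x @ Q @ x

def phs_rhs(x, J, R, Q):
    """Port-Hamiltonian vector field dx/dt = (J - R) dH/dx, with dH/dx = Q x."""
    return (J - R) @ (Q @ x)

def integrate(x0, J, R, Q, dt=1e-3, steps=2000):
    """Explicit Euler rollout (a NODE would backpropagate through this loop)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * phs_rhs(x, J, R, Q)
    return x

Q = np.eye(2)                               # energy metric
J = np.array([[0.0, 1.0], [-1.0, 0.0]])     # skew-symmetric interconnection
R_none = np.zeros((2, 2))                   # no dissipation
R_diss = 0.1 * np.eye(2)                    # PSD dissipation matrix

x0 = np.array([1.0, 0.0])
H0 = hamiltonian(x0, Q)
H_conserved = hamiltonian(integrate(x0, J, R_none, Q), Q)
H_dissipated = hamiltonian(integrate(x0, J, R_diss, Q), Q)
```

In a PC-NODE, J, R, and the Hamiltonian would be parameterized (with R constrained to be PSD) and fitted to trajectories; the consistency guarantee comes from the parameterization, not from the data.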
Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video. While the task has received much attention in recent decades, researchers have almost exclusively focused on the single-object setting. Multi-object GOT offers wider applicability, rendering it more attractive for real-world applications. We attribute the lack of research interest in this problem to the absence of suitable benchmarks. In this work, we introduce a new large-scale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence. Our benchmark allows researchers to tackle key remaining challenges in GOT, aiming to increase robustness and reduce computation through joint tracking of multiple objects simultaneously. Furthermore, we propose a Transformer-based GOT tracker, TaMOs, capable of jointly processing multiple objects through shared computation. TaMOs achieves a 4x faster run-time with 10 concurrent objects compared to tracking each object independently, and outperforms existing single-object trackers on our new benchmark. Finally, TaMOs achieves highly competitive results on single-object GOT datasets, setting a new state of the art on TrackingNet with a success-rate AUC of 84.4%. Our benchmark, code, and trained models will be made publicly available.
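For context, the success-rate AUC quoted above is conventionally computed by sweeping an IoU threshold over [0, 1] and averaging the fraction of frames whose predicted box overlaps the ground truth above each threshold. A minimal sketch (box format and threshold grid are illustrative choices):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def success_auc(ious, thresholds=None):
    """Mean fraction of frames whose IoU exceeds each threshold in [0, 1]."""
    if thresholds is None:
        thresholds = [i / 100 for i in range(101)]
    curve = [sum(v > t for v in ious) / len(ious) for t in thresholds]
    return sum(curve) / len(curve)
```

A per-sequence score is `success_auc` over that sequence's frame IoUs; the benchmark number averages over sequences (and, in the multi-object case, over targets).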
Learned Bloom Filters, i.e., models induced from data via machine learning techniques that solve the approximate set membership problem, have recently been introduced with the aim of enhancing the performance of standard Bloom Filters, with a special focus on space occupancy. Unlike in the classical case, the "complexity" of the data used to build the filter can heavily impact its performance. Therefore, we propose here the first in-depth analysis, to the best of our knowledge, for assessing the performance of a given Learned Bloom Filter, in conjunction with a given classifier, on a dataset of a given classification complexity. Indeed, we propose a novel methodology, supported by software, for designing, analyzing, and implementing Learned Bloom Filters as a function of specific constraints on their multi-criteria nature (that is, constraints involving space efficiency, false positive rate, and reject time). Our experiments show that the proposed methodology and the supporting software are valid and useful: we find that only two classifiers have desirable properties in relation to problems of different data complexity and, interestingly, neither of them has been considered so far in the literature. We also show experimentally that the Sandwiched variant of Learned Bloom Filters is the most robust to data complexity and classifier performance variability, and usually has the smallest reject times. The software can readily be used to test new Learned Bloom Filter proposals, which can be compared with the best ones identified here.
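A minimal sketch of the standard learned-Bloom-filter construction under analysis here (the "classifier" is a stand-in scoring function; bit count, hash count, and threshold are arbitrary): keys scored below the threshold go into a backup Bloom filter, so the structure has no false negatives, while false positives can come from either the classifier or the backup filter.

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter over m bits with k hash functions."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _indices(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = True

    def __contains__(self, item):
        return all(self.bits[idx] for idx in self._indices(item))

class LearnedBloomFilter:
    """Score >= tau -> report member; otherwise fall back to a backup
    Bloom filter built on the classifier's false negatives."""
    def __init__(self, score, tau, keys):
        self.score, self.tau = score, tau
        self.backup = BloomFilter()
        for key in keys:
            if score(key) < tau:          # classifier would miss this key
                self.backup.add(key)

    def __contains__(self, item):
        return self.score(item) >= self.tau or item in self.backup

# Toy "classifier": last decimal digit as a membership score.
score = lambda x: (x % 10) / 10
lbf = LearnedBloomFilter(score, tau=0.5, keys=[3, 4, 18])
```

The multi-criteria trade-off studied in the paper is visible even here: raising `tau` shrinks the classifier's false-positive region but pushes more keys into the backup filter (more space), and every negative query that reaches the backup filter pays the reject-time cost of k hash evaluations.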
Feature selection is of great importance in machine learning, where it can be used to reduce the dimensionality of classification, ranking, and prediction problems. The removal of redundant and noisy features can improve both the accuracy and scalability of the trained models. However, feature selection is a computationally expensive task whose solution space grows combinatorially. In this work, we consider in particular a quadratic feature selection problem that can be tackled with the Quantum Approximate Optimization Algorithm (QAOA), already employed in combinatorial optimization. First, we represent the feature selection problem in the QUBO formulation, which is then mapped to an Ising spin Hamiltonian. Then we apply QAOA with the goal of finding the ground state of this Hamiltonian, which corresponds to the optimal selection of features. In our experiments, we consider seven real-world datasets with dimensionality up to 21 and run QAOA both on a quantum simulator and, for small datasets, on the 7-qubit IBM (ibm-perth) quantum computer. We use the set of selected features to train a classification model and evaluate its accuracy. Our analysis shows that it is possible to tackle the feature selection problem with QAOA and that currently available quantum devices can be used effectively. Future studies could test a wider range of classification models, as well as improve the effectiveness of QAOA by exploring better-performing optimizers for its classical step.
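To make the QUBO step concrete, here is a toy instance with invented coefficients: diagonal terms reward individually relevant features, off-diagonal terms penalize redundant pairs, and an exhaustive solver stands in for QAOA's ground-state search (which is only needed once n makes enumeration infeasible).

```python
from itertools import product

def qubo_energy(x, Q):
    """QUBO objective x^T Q x for a binary selection vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def solve_qubo_bruteforce(Q):
    """Exhaustive stand-in for QAOA's ground-state search (fine for small n)."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(x, Q))

# Toy problem: features 0 and 1 are relevant (negative diagonal);
# features 0 and 2 are redundant with each other (positive off-diagonal).
Q = [
    [-1.0,  0.0,  2.0],
    [ 0.0, -1.0,  0.0],
    [ 2.0,  0.0, -0.5],
]
selected = solve_qubo_bruteforce(Q)
```

The minimizer keeps features 0 and 1 and drops feature 2, whose relevance gain is outweighed by its redundancy penalty. Mapping x_i in {0, 1} to spins s_i = 1 - 2x_i turns this objective into the Ising Hamiltonian that QAOA then optimizes.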
In this work, we address the problem of 4D facial expression generation. This is usually tackled by animating a neutral 3D face to reach an expression peak and then returning to the neutral state. In the real world, however, people exhibit more complex expressions and transition from one expression to another. We therefore propose a new model that generates transitions between different expressions and synthesizes long, composed 4D expressions. This involves three sub-problems: (i) modeling the temporal dynamics of expressions, (ii) learning transitions between them, and (iii) deforming a generic mesh. We propose to encode the temporal evolution of expressions through the motion of a set of 3D landmarks, which we learn to generate by training a manifold-valued GAN (Motion3DGAN). To allow the generation of composed expressions, the model accepts two labels encoding the starting and ending expressions. The final sequence of meshes is generated by a sparse-to-dense mesh decoder (S2D-Dec), which maps landmark displacements to dense, per-vertex displacements of a known mesh topology. By explicitly working with motion trajectories, the model is fully independent of identity. Extensive experiments on five public datasets show that our proposed approach brings significant improvements over previous solutions, while retaining good generalization to unseen data.
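The sparse-to-dense decoding step can be sketched as a map from stacked landmark displacements to stacked per-vertex displacements applied on a fixed template. In the paper this map is the learned S2D-Dec network; here it is a random linear placeholder, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_landmarks, n_vertices = 5, 50

# Hypothetical stand-in for the learned sparse-to-dense decoder:
# a random linear map from 5x3 landmark displacements to 50x3 vertex displacements.
S2D = rng.normal(scale=0.1, size=(n_vertices * 3, n_landmarks * 3))
template = rng.normal(size=(n_vertices, 3))   # neutral mesh of a known topology

def decode_frame(landmark_disp):
    """Map landmark displacements to per-vertex displacements on the template."""
    dense = (S2D @ landmark_disp.reshape(-1)).reshape(n_vertices, 3)
    return template + dense

# A short synthetic motion trajectory (Motion3DGAN output in the paper) decoded
# frame by frame into a mesh sequence.
seq = [decode_frame(rng.normal(scale=0.01, size=(n_landmarks, 3)))
       for _ in range(4)]
```

Identity independence falls out of the factorization: zero landmark motion reproduces the template exactly, so swapping in a different identity's template changes who performs the expression without changing the motion.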
Active inference is a mathematical framework that originated in computational neuroscience. Recently, it has been shown to be a promising approach for building goal-driven behavior in robotics. Specifically, the active inference controller (AIC) has been successful in several continuous control and state-estimation tasks. Despite its relative success, some established design choices lead to a number of practical limitations for robot control. These include a biased estimate of the state and an only implicit model of the control actions. In this paper, we highlight these limitations and propose an extended version, the unbiased active inference controller (U-AIC). The U-AIC retains all the compelling benefits of the AIC while removing its limitations. Simulation results on a 2-DOF arm and experiments on a real 7-DOF manipulator show the improved performance of the U-AIC with respect to the standard AIC. The code is available at https://github.com/cpezzato/unbiased_aic.
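A one-dimensional sketch of the free-energy minimization underlying AIC-style state estimation (quadratic prediction errors under Gaussian prior and observation models; all gains and variances are illustrative, and the U-AIC's debiasing machinery is not modeled): gradient descent on the free energy drives the belief to the precision-weighted mean of prior and observation.

```python
def free_energy_grad(mu, y, prior, var_y, var_p):
    """d/dmu of F = (y-mu)^2/(2*var_y) + (mu-prior)^2/(2*var_p)."""
    return -(y - mu) / var_y + (mu - prior) / var_p

def estimate(y, prior, var_y=1.0, var_p=1.0, lr=0.1, steps=500):
    """Belief update by gradient descent on the variational free energy."""
    mu = prior
    for _ in range(steps):
        mu -= lr * free_energy_grad(mu, y, prior, var_y, var_p)
    return mu
```

The fixed point is mu* = (y/var_y + prior/var_p) / (1/var_y + 1/var_p). The bias the paper targets is visible in this structure: when the "prior" encodes the control target rather than a faithful dynamics model, the estimate is pulled away from the observation, which is one motivation for separating estimation from control in the U-AIC.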
The goal of change detection (CD) is to detect changes by comparing two images taken at different times. The challenging part of CD is to track the changes the user wants to highlight, such as new buildings, while ignoring changes due to external factors such as the environment, lighting conditions, fog, or seasonal variations. Recent developments in deep learning have allowed researchers to achieve outstanding performance in this field. In particular, different mechanisms of space-time attention make it possible to exploit the spatial features extracted by the models and to correlate them temporally by exploiting both available images. The downside is that these models have become increasingly complex and large, often unfeasible for edge applications. These are limitations when the models must be applied in industrial settings or in applications requiring real-time performance. In this work, we propose a novel model, called TinyCD, that proves to be both lightweight and effective, able to reach the state of the art with 13-150x fewer parameters. In our approach, we exploit the importance of low-level features when comparing images. To this end, we use only a few backbone blocks, a strategy that keeps the number of network parameters low. To compose the features extracted from the two images, we introduce a novel, parameter-economical mixing block capable of cross-correlating features in both the space and time domains. Finally, to fully exploit the information contained in the computed features, we define a PW-MLP block capable of performing a pixel-wise classification. Source code, models, and results are available at: https://github.com/andreacodegoni/tiny_model_4_cd
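A shape-level sketch of the two blocks named above, with random weights in place of trained ones (and plain channel concatenation standing in for TinyCD's actual mixing block): features from the two dates are combined per pixel, and one shared two-layer MLP scores every spatial location for change.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixing_block(feat_t0, feat_t1):
    """Stack the two temporal feature maps channel-wise so later layers can
    cross-correlate them (a simplified stand-in for TinyCD's mixing block)."""
    return np.concatenate([feat_t0, feat_t1], axis=-1)

def pw_mlp(features, w1, w2):
    """Pixel-wise MLP: the same 2-layer MLP applied at every location."""
    hidden = np.maximum(features @ w1, 0.0)   # ReLU
    logits = hidden @ w2
    return 1.0 / (1.0 + np.exp(-logits))      # per-pixel change probability

H, W, C = 8, 8, 4
f0 = rng.normal(size=(H, W, C))               # backbone features at time t0
f1 = rng.normal(size=(H, W, C))               # backbone features at time t1
w1 = rng.normal(size=(2 * C, 16))             # hypothetical MLP weights
w2 = rng.normal(size=(16, 1))
prob = pw_mlp(mixing_block(f0, f1), w1, w2)
```

Because the MLP weights are shared across all pixels, the parameter count is independent of image resolution, which is the kind of economy the abstract refers to.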
In object tracking and state estimation problems, ambiguous evidence such as imprecise measurements and the absence of detections can contain valuable information and can thus be exploited to further refine the probabilistic belief state. In particular, knowledge of a sensor's limited field of view can be leveraged to incorporate evidence of where an object was not observed. This paper presents a systematic approach for incorporating knowledge of the field-of-view geometry and position, along with object inclusion/exclusion evidence, into object state densities and the cardinality distributions of random-finite-set multi-object models. The resulting state estimation problem is nonlinear and is solved using a new Gaussian mixture approximation based on recursive component splitting. Based on this approximation, a novel Gaussian mixture Bernoulli filter for imprecise measurements is derived and demonstrated on a tracking problem that uses only natural-language statements as inputs. The paper also considers the relationship between the bounded field of view and the cardinality distribution for a representative selection of multi-object distributions, which can be used for sensor planning, as demonstrated on a problem involving multi-Bernoulli processes with up to five hundred potential objects.
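To illustrate how field-of-view knowledge enters the belief state, the 1D sketch below computes the probability mass of an object's Gaussian state density inside the FoV and plugs it into the textbook Bernoulli missed-detection update (the paper's recursive-splitting machinery is not reproduced; variable names are illustrative): an undetected object well inside the FoV loses existence probability, while one outside the FoV does not.

```python
from math import erf, sqrt

def prob_in_fov(mean, std, lo, hi):
    """Probability mass of N(mean, std^2) inside the sensor FoV [lo, hi]."""
    cdf = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    return cdf((hi - mean) / std) - cdf((lo - mean) / std)

def update_existence(r, p_detect, in_fov_mass):
    """Bernoulli existence update after a missed detection, with the
    detection probability scaled by the in-FoV probability mass."""
    p_miss = 1.0 - p_detect * in_fov_mass
    return r * p_miss / (1.0 - r + r * p_miss)
```

With `in_fov_mass = 0` the update is the identity: a missed detection outside the FoV carries no evidence of non-existence, which is exactly the inclusion/exclusion reasoning the abstract describes.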
Navigation inside luminal organs is an arduous task that requires non-intuitive coordination between the movements of the operator's hand and the information obtained from the endoscopic video. Developing tools to automate certain tasks could alleviate the physical and mental load of doctors during interventions, allowing them to focus on diagnostic and decision-making tasks. In this paper, we present a synergistic solution for intraluminal navigation consisting of a 3D-printed endoscopic soft robot that can move safely inside luminal structures. Visual servoing based on a convolutional neural network (CNN) is used to accomplish the autonomous navigation task. The CNN is trained with phantom and in-vivo data to segment the lumen, and a model-free approach is presented to control the movement in constrained environments. The proposed robot is validated in anatomical phantoms with different path configurations. We analyze the movement of the robot using different metrics such as task completion time, smoothness, steady-state error, and mean and maximum error. We show that our method is suitable for safe navigation in hollow environments and under conditions that differ from those on which the network was originally trained.
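The model-free control idea can be sketched as proportional centering: segment the lumen, take the centroid of the mask, and command the bending tip so that the centroid is driven toward the image center. Sign conventions and the gain are illustrative; the actual CNN and controller are not reproduced here.

```python
def mask_centroid(mask):
    """Centroid (row, col) of a binary segmentation mask given as a 2D list."""
    rows = cols = count = 0
    for r, row in enumerate(mask):
        for c, v in enumerate(row):
            if v:
                rows += r
                cols += c
                count += 1
    if count == 0:
        return None
    return rows / count, cols / count

def centering_command(mask, gain=0.5):
    """Proportional command driving the lumen centroid to the image center."""
    centroid = mask_centroid(mask)
    if centroid is None:
        return 0.0, 0.0          # no lumen visible: hold position
    h, w = len(mask), len(mask[0])
    err_r = centroid[0] - (h - 1) / 2
    err_c = centroid[1] - (w - 1) / 2
    return -gain * err_r, -gain * err_c
```

Because the command depends only on the current segmentation, no kinematic model of the soft robot is required, which is what makes the approach model-free and tolerant of conditions unseen during training (as long as the segmentation holds up).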
Deep learning has advanced face recognition to unprecedented accuracy. However, how the local parts of the face affect overall recognition performance remains unclear. Among other approaches, face swapping has been experimented with, but only for the entire face. In this paper, we propose swapping facial parts as a way to disentangle the recognition relevance of different face parts, such as the eyes, nose, and mouth. In our approach, parts are transferred from a source face to a target one by fitting a 3D prior, which establishes dense pixel correspondence between the parts while also handling pose differences. Seamless cloning is then used to obtain a smooth transition between the mapped source region and the shape and skin tone of the target face. We design an experimental protocol that allows us to draw some preliminary conclusions when classifying with deep networks, indicating the prominence of the eye and eyebrow region. Code is available at https://github.com/clferrari/facepartsswap
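As a minimal stand-in for the final compositing step (a hard masked paste rather than the seamless cloning actually used, and assuming the source part has already been warped into dense correspondence by the 3D fitting), part swapping reduces to mask-gated blending:

```python
import numpy as np

def swap_part(target, source, mask):
    """Replace the masked facial part of `target` with `source` pixels.
    `target`/`source`: HxWx3 float images assumed in dense correspondence;
    `mask`: HxW binary part mask (e.g., the nose region)."""
    alpha = mask.astype(float)[..., None]
    return alpha * source + (1.0 - alpha) * target

# Toy images: a black target, a white source, and a hypothetical part mask.
H, W = 4, 4
target = np.zeros((H, W, 3))
source = np.ones((H, W, 3))
mask = np.zeros((H, W), dtype=bool)
mask[1:3, 1:3] = True                 # illustrative "part" region
out = swap_part(target, source, mask)
```

The hard paste leaves visible seams at the mask boundary; replacing it with Poisson-based seamless cloning (as in the paper) matches gradients across the boundary so the transplanted part adopts the target's skin tone.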